Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[8.x] [SecuritySolution][SIEM migrations] Implement background task API (#197997) #199209

Merged
merged 1 commit into from
Nov 7, 2024

Conversation

semd
Copy link
Contributor

@semd semd commented Nov 6, 2024

Backport

This will backport the following commits from main to 8.x:

Questions ?

Please refer to the Backport tool documentation

…astic#197997)

## Summary

It implements the background task to execute the rule migrations and the
API to manage them. It also contains a basic implementation of the
langGraph agent workflow that will perform the migration using
generative AI.

> [!NOTE]
> This feature needs `siemMigrationsEnabled` experimental flag enabled
to work. Otherwise, the new API routes won't be registered, and the
`SiemRuleMigrationsService` _setup_ won't be called. So no migration
task code can be reached, and no data stream/template will be installed
to ES.

### The rule migration task implementation:

- Retrieve a batch of N rule migration documents (50 rules initially, we
may change that later) with `status: pending`.
- Update those documents to `status: processing`.
- Execute the migration for each of the N migrations in parallel.
- If there is any error update the document with `status: error`.
- For each rule migration that finishes we set the result to the
storage, and also update `status: finished`.
- When all the batch of rules is finished the task will check if there
are still migration documents with `status: pending` if so it will
process the next batch with a delay (10 seconds initially, we may change
that later).
- If the task is stopped (via API call or server shut-down), we do a
bulk update for all the `status: processing` documents back to `status:
pending`.

### Task API

- `POST /internal/siem_migrations/rules` (implemented
[here](elastic/security-team#10654)) ->
Creates the migration on the backend and stores the original rules. It
returns the `migration_id`
- `GET /internal/siem_migrations/rules/stats` -> Retrieves the stats for
all the existing migrations, aggregated by `migration_id`.
- `GET /internal/siem_migrations/rules/{migration_id}` -> Retrieves all
the migration rule documents of a specific migration.
- `PUT /internal/siem_migrations/rules/{migration_id}/start` -> Starts
the background task for a specific migration.
- `GET /internal/siem_migrations/rules/{migration_id}/stats` ->
Retrieves the stats of a specific migration task. The UI will do polling
to this endpoint.
- `PUT /internal/siem_migrations/rules/{migration_id}/stop` -> Stops the
execution of a specific migration running task. When a migration is
stopped, the executing task is aborted and all the rules in the batch
being processed are moved back to pending, all finished rules will
remain stored. When the Kibana server shuts down all the running
migrations are stopped automatically. To resume the migration we can
call `{migration_id}/start` again and it will take it from the same
rules batch it was left.

#### Stats (UI polling) response example:
```
{
    "status": "running",
    "rules": {
        "total": 34,
        "finished": 20,
        "pending": 4,
        "processing": 10,
        "failed": 0
    },
    "last_updated_at": "2024-10-29T15:04:49.618Z"
}
```

### LLM agent Graph

The initial implementation of the agent graph that is executed per rule:

![agent graph
diagram](https://github.com/user-attachments/assets/9228350c-a469-449b-a58a-0b452bb805aa)

The first node tries to match the original rule with an Elastic prebuilt
rule. If it does not succeed, the second node will try to translate the
query as a custom rule using the ES|QL knowledge base, this composes
previous PoCs:
- elastic#193900
- elastic#196651

## Testing locally

Enable the flag
```
xpack.securitySolution.enableExperimental: ['siemMigrationsEnabled']
```

cURL request examples:

<details>
  <summary>Rules migration `create` POST request</summary>

```
curl --location --request POST 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1' \
--header 'Content-Type: application/json' \
--data '[
    {
        "id": "f8c325ea-506e-4105-8ccf-da1492e90115",
        "vendor": "splunk",
        "title": "Linux Auditd Add User Account Type",
        "description": "The following analytic detects the suspicious add user account type. This behavior is critical for a SOC to monitor because it may indicate attempts to gain unauthorized access or maintain control over a system. Such actions could be signs of malicious activity. If confirmed, this could lead to serious consequences, including a compromised system, unauthorized access to sensitive data, or even a wider breach affecting the entire network. Detecting and responding to these signs early is essential to prevent potential security incidents.",
        "query": "sourcetype=\"linux:audit\" type=ADD_USER \n| rename hostname as dest \n| stats count min(_time) as firstTime max(_time) as lastTime by exe pid dest res UID type \n| `security_content_ctime(firstTime)` \n| `security_content_ctime(lastTime)`\n| search *",
        "query_language":"spl",
        "mitre_attack_ids": [
            "T1136"
        ]
    },
    {
        "id": "7b87c556-0ca4-47e0-b84c-6cd62a0a3e90",
        "vendor": "splunk",
        "title": "Linux Auditd Change File Owner To Root",
        "description": "The following analytic detects the use of the '\''chown'\'' command to change a file owner to '\''root'\'' on a Linux system. It leverages Linux Auditd telemetry, specifically monitoring command-line executions and process details. This activity is significant as it may indicate an attempt to escalate privileges by adversaries, malware, or red teamers. If confirmed malicious, this action could allow an attacker to gain root-level access, leading to full control over the compromised host and potential persistence within the environment.",
        "query": "`linux_auditd` `linux_auditd_normalized_proctitle_process`\r\n| rename host as dest \r\n| where LIKE (process_exec, \"%chown %root%\") \r\n| stats count min(_time) as firstTime max(_time) as lastTime by process_exec proctitle normalized_proctitle_delimiter dest \r\n| `security_content_ctime(firstTime)` \r\n| `security_content_ctime(lastTime)`\r\n| `linux_auditd_change_file_owner_to_root_filter`",
        "query_language": "spl",
        "mitre_attack_ids": [
            "T1222"
        ]
    }
]'
```
</details>

<details>
  <summary>Rules migration `start` task request</summary>

- Assuming the connector `azureOpenAiGPT4o` is already created in the
local environment.
- Using the {{`migration_id`}} from the first POST request response

```
curl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/start' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1' \
--header 'Content-Type: application/json' \
--data '{
    "connectorId": "azureOpenAiGPT4o"
}'
```
</details>

<details>
  <summary>Rules migration `stop` task request</summary>

- Using the {{`migration_id`}} from the first POST request response.

```
curl --location --request PUT 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stop' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```
</details>

<details>
  <summary>Rules migration task `stats` request</summary>

- Using the {{`migration_id`}} from the first POST request response.

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}/stats' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```
</details>

<details>
  <summary>Rules migration rules documents request</summary>

- Using the {{`migration_id`}} from the first POST request response.

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/{{migration_id}}' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```
</details>

<details>
  <summary>Rules migration all stats request</summary>

```
curl --location --request GET 'http://elastic:changeme@localhost:5601/internal/siem_migrations/rules/stats' \
--header 'kbn-xsrf;' \
--header 'x-elastic-internal-origin: security-solution' \
--header 'elastic-api-version: 1'
```
</details>

---------

Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit cc66320)

# Conflicts:
#	x-pack/plugins/security_solution/common/api/quickstart_client.gen.ts
#	x-pack/plugins/security_solution/server/types.ts
#	x-pack/test/api_integration/services/security_solution_api.gen.ts
@elasticmachine
Copy link
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #48 / discover/esql discover esql view query history should add a failed query to the history
  • [job] [logs] FTR Configs #67 / Saved Objects Management saved objects management with hidden types Delete modal should not delete the hidden objects when performing the operation

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
securitySolution 118 119 +1
Unknown metric groups

API count

id before after diff
securitySolution 186 187 +1

@semd semd requested review from elasticmachine and removed request for kibanamachine and elasticmachine November 7, 2024 10:36
@semd semd merged commit de6da8a into elastic:8.x Nov 7, 2024
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants